Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning
Authors
Abstract
Recommender systems play a crucial role in mitigating information overload by suggesting personalized items or services to users. The vast majority of traditional recommender systems treat the recommendation procedure as a static process and make recommendations following a fixed strategy. In this paper, we propose a novel recommender system that continuously improves its strategies during its interactions with users. We model the sequential interactions between users and the recommender system as a Markov Decision Process (MDP) and leverage Reinforcement Learning (RL) to automatically learn the optimal strategies by recommending items in a trial-and-error fashion and receiving reinforcements from users' feedback on those items. Users' feedback can be positive or negative, and both types have great potential to boost recommendations. However, negative feedback is far more abundant than positive feedback, so incorporating both simultaneously is challenging: the positive signal can be buried by the negative one. In this paper, we develop a novel approach to incorporate both types of feedback into the proposed deep recommender system (DEERS) framework. Experimental results on real-world e-commerce data demonstrate the effectiveness of the proposed framework. Further experiments have been conducted to understand the importance of both positive and negative feedback in recommendations.
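The abstract's core idea (separating positive- and negative-feedback signals in the state fed to an RL agent) can be illustrated with a minimal sketch. This is not the paper's actual DEERS architecture: the toy item embeddings, the mean-pooled state, the linear Q-function, and all names here are illustrative assumptions, standing in for the learned deep network described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB = 4        # toy item-embedding size (assumed; DEERS learns embeddings end-to-end)
N_ITEMS = 10
GAMMA = 0.9    # discount factor of the MDP
LR = 0.05

# Toy item embeddings standing in for learned representations.
items = rng.normal(size=(N_ITEMS, EMB))

# Linear Q(s, a) over the concatenation [s_plus, s_minus, item_a]:
# the state keeps positively and negatively rated items in separate
# components, mirroring the idea of not letting scarce positive
# feedback be buried by abundant negative feedback.
w = np.zeros(3 * EMB)

def state(pos_hist, neg_hist):
    """Mean-pool liked and disliked item histories into s = [s_plus, s_minus]."""
    s_plus = items[pos_hist].mean(axis=0) if pos_hist else np.zeros(EMB)
    s_minus = items[neg_hist].mean(axis=0) if neg_hist else np.zeros(EMB)
    return np.concatenate([s_plus, s_minus])

def q(s, a):
    """Linear Q-value of recommending item a in state s."""
    return w @ np.concatenate([s, items[a]])

def td_update(s, a, r, s_next):
    """One Q-learning step: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    global w
    target = r + GAMMA * max(q(s_next, a2) for a2 in range(N_ITEMS))
    feat = np.concatenate([s, items[a]])
    w += LR * (target - q(s, a)) * feat

# One trial-and-error interaction: recommend item 1, user gives positive
# feedback (r = +1), and item 1 moves into the positive history.
s0 = state([0], [5])
td_update(s0, 1, 1.0, state([0, 1], [5]))
```

Since `w` starts at zero, a single update with reward +1 pushes `q(s0, 1)` above zero, i.e. the agent's estimate of that recommendation improves after the positive interaction.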
Similar Resources
The Effect of Pairwise Video Feedback on the Learning of Elegant Eye-Hand Coordination Skill
The present paper aimed to study the effect of pairwise video feedback (observing an external model of the skill while simultaneously performing it) on the acquisition and learning of an eye-hand coordination skill. A computer-based eye-hand coordination task was the instrument used in this study. 24 subjects were randomly selected and equally divided...
Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds
We describe a method to use discrete human feedback to enhance the performance of deep learning agents in virtual three-dimensional environments by extending deep reinforcement learning to model the confidence and consistency of human feedback. This enables deep reinforcement learning algorithms to determine the most appropriate time to listen to the human feedback, exploit the current policy m...
Reinforcement Learning from Demonstration and Human Reward
In this paper, we propose a model-based method, IRL-TAMER, for combining learning from demonstration via inverse reinforcement learning (IRL) and learning from human reward via the TAMER framework. We tested our method in the Grid World domain and compared it with the TAMER framework using different discount factors on human reward. Our results suggest that with one demonstration, although an agen...
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristic of the problem, it is first formulated as a Markov Decision Process (MDP). Next, a Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
Web pages ranking algorithm based on reinforcement learning and user feedback
The main challenge of a search engine is ranking web documents to provide the best response to a user's query. Despite the huge number of results extracted for a user's query, only a small number of the first results are examined by users; therefore, the insertion of the related results in the first ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement le...
Journal: CoRR
Volume: abs/1802.06501
Pages: -
Publication date: 2018